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Abstract 


The purpose of this document is to move the F-RTO (Forward 
RTO-Recovery) functionality for TCP in RFC 4138 from 

Experimental to Standards Track status. The F-RTO support for Stream 
Control Transmission Protocol (SCTP) in RFC 4138 remains with 
Experimental status. See Appendix B for the differences between this 
document and RFC 4138. 


Spurious retransmission timeouts cause suboptimal TCP performance 
because they often result in unnecessary retransmission of the last 
window of data. This document describes the F-RTO detection 
algorithm for detecting spurious TCP retransmission timeouts. F-RTO 
is a TCP sender-only algorithm that does not require any TCP options 
to operate. After retransmitting the first unacknowledged segment 
triggered by a timeout, the F-RTO algorithm of the TCP sender 
monitors the incoming acknowledgments to determine whether the 
timeout was spurious. It then decides whether to send new segments 
or retransmit unacknowledged segments. The algorithm effectively 
helps to avoid additional unnecessary retransmissions and thereby 
improves TCP performance in the case of a spurious timeout. 


Status of This Memo 


This document specifies an Internet standards track protocol for the 
Internet community, and requests discussion and suggestions for 


improvements. Please refer to the current edition of the "Internet 
Official Protocol Standards" (STD 1) for the standardization state 
and status of this protocol. Distribution of this memo is unlimited. 
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Copyright and License Notice 


Copyright (c) 2009 IETF Trust and the persons identified as the 
document authors. All rights reserved. 


This document is subject to BCP 78 and the IETF Trust’s Legal 
Provisions Relating to IETF Documents 
(http://trustee.ietf.org/license-info) in effect on the date of 
publication of this document. Please review these documents 
carefully, as they describe your rights and restrictions with respect 
to this document. Code Components extracted from this document must 
include Simplified BSD License text as described in Section 4.e of 
the Trust Legal Provisions and are provided without warranty as 
described in the BSD License. 


Table of Contents 


ies? cE RO CU CE IA Sessa WA wat aa Ka gt BSS ti WA cared. ESS 3 
Tals: Conventions- -and Terminology "slots weasel ois cas Bs wil BEN ie ee 5 
22s Basis BERT OAL GOLA GAM IIIA 5 
AA WE The 2A GOT CMs she: ca cy giter es ah daira ae kites sah Gh ais AAA AAA AA 5 
22 DISCUSSION. ookee gy brett a ae AI vendo a a tate wile eoes oie Mees ara orate 7 
3. SACK-Enhanced Version of the F-RTO Algorithm .................... 9 
3e Ee -THRESATGOLIT CAME tong ens ere eo Sge eee Senge waders AA NG OH al Peele hed bE reed 9 
Sa DAS CUSS IA Susie ieee else cen ig set eat AI AAA 11 
4. Taking Actions after Detecting Spurious RTO .................... 11 
54 Evaluation. -of RFG -41 38% EIA ies: ee aie Sapo ie dele be YA 
6.2 Security- Considerations IA EIA 13 
TAMA WI mn ES genae ae a a KA AI IA aa 14 
Appendix A. Discussion of Window-Limited Cases .................... 15 
Appendix B. Changes since RFC 4138 ................................ 16 
References aa fila othe, Ka AA Aa leita: sea anette eke Nave ua Pel AE 16 
Notmative’ References: MA SS eles sion eee. AAA 16 
Informative References ddy ce eee ee ee eee eee ee ee ee ee eens 17 


Sarolahti, et al. Standards Track [Page 2] 


RFC 5682 F-RTO September 2009 


Les 


Introduction 
The Transmission Control Protocol (TCP) [Pos81] has two methods for 
triggering retransmissions. First, the TCP sender relies on incoming 


duplicate acknowledgments (ACKs), which indicate that the receiver is 
missing some of the data. After a required number of successive 
duplicate ACKs have arrived at the sender, it retransmits the first 
unacknowledged segment [APB09] and continues with a loss recovery 
algorithm such as NewReno [FHG04] or SACK-based (Selective 
Acknowledgment) loss recovery [BAFWO03]. Second, the TCP sender 
maintains a retransmission timer that triggers retransmission of 
segments, if they have not been acknowledged before the 
retransmission timeout (RTO) occurs. When the retransmission timeout 
occurs, the TCP sender enters the RTO recovery where the congestion 
window is initialized to one segment and unacknowledged segments are 
retransmitted using the slow-start algorithm. The retransmission 
timer is adjusted dynamically, based on the measured round-trip times 
[PAOO]. 


It has been pointed out that the retransmission timer can expire 
spuriously and cause unnecessary retransmissions when no segments 
have been lost [LK00, GLO2, LM03]. After a spurious retransmission 
timeout, the late acknowledgments of the original segments arrive at 
the sender, usually triggering unnecessary retransmissions of a whole 
window of segments during the RTO recovery. Furthermore, after a 
spurious retransmission timeout, a conventional TCP sender increases 
the congestion window on each late acknowledgment in slow start. 

This injects a large number of data segments into the network within 
one round-trip time, thus violating the packet conservation principle 
[Jac88]. 


There are a number of potential reasons for spurious retransmission 
timeouts. First, some mobile networking technologies involve sudden 
delay spikes on transmission because of actions taken during a hand- 
off. Second, a hand-off may take place from a low latency path toa 
high latency path, suddenly increasing the round-trip time beyond the 
current RTO value. Third, on a low-bandwidth link the arrival of 
competing traffic (possibly with higher priority), or some other 
change in available bandwidth, can cause a sudden increase of the 
round-trip time. This may trigger a spurious retransmission timeout. 
A persistently reliable link layer can also cause a sudden delay when 
a data frame and several retransmissions of it are lost for some 
reason. This document does not distinguish between the different 
causes of such a delay spike. Rather, it discusses the spurious 
retransmission timeouts caused by a delay spike in general. 
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This document describes the F-RTO detection algorithm for TCP. It is 
based on the detection mechanism of the "Forward RTO-Recovery" 

(F-RTO) algorithm [SKRO3] that is used for detecting spurious 
retransmission timeouts and thus avoids unnecessary retransmissions 
following the retransmission timeout. When the timeout is not 
spurious, the F-RTO algorithm reverts back to the conventional RTO 
recovery algorithm, and therefore has similar behavior and 
performance. In contrast to alternative algorithms proposed for 
detecting unnecessary retransmissions (Eifel [LK00, LM03] and DSACK- 
based (Duplicate SACK) algorithms [BA04]), F-RTO does not require any 
TCP options for its operation, and it can be implemented by modifying 
only the TCP sender. The Eifel algorithm uses TCP timestamps [BBJ92] 
for detecting a spurious timeout upon arrival of the first 
acknowledgment after the retransmission. The DSACK-based algorithms 
require that the TCP Selective Acknowledgment Option [MMFR96], with 
the DSACK extension [FMMP00], is in use. With DSACK, the TCP 
receiver can report if it has received a duplicate segment, enabling 
the sender to detect afterwards whether it has retransmitted segments 
unnecessarily. The F-RTO algorithm only attempts to detect and avoid 
unnecessary retransmissions after an RTO. Eifel and DSACK can also 
be used for detecting unnecessary retransmissions caused by other 
events, such as packet reordering. 


When the retransmission timer expires, the F-RTO sender retransmits 
the first unacknowledged segment as usual [APBO9]. Deviating from 
the normal operation after a timeout, it then tries to transmit new, 
previously unsent data for the first acknowledgment that arrives 
after the timeout, given that the acknowledgment advances the window. 
If the second acknowledgment that arrives after the timeout advances 
the window (i.e., acknowledges data that was not retransmitted), the 
F-RTO sender declares the timeout spurious and exits the RTO 
recovery. However, if either of these two acknowledgments is a 
duplicate ACK, there will not be sufficient evidence of a spurious 
timeout. Therefore, the F-RTO sender retransmits the unacknowledged 
segments in slow start similar to the traditional algorithm. With a 
SACK-enhanced version of the F-RTO algorithm, spurious timeouts may 
be detected even if duplicate ACKs arrive after an RTO 
retransmission. 


This document specifies the F-RTO algorithm for TCP only, replacing 
the F-RTO functionality with TCP in RFC 4138 [SK05] and moving it 
from Experimental to Standards Track status. The algorithm can also 
be applied to the Stream Control Transmission Protocol (SCTP) [Ste07] 
that has acknowledgment and packet retransmission concepts similar to 
TCP. The considerations on applying F-RTO to SCTP are discussed in 
RFC 4138, but the F-RTO support for SCTP remains with Experimental 
status. 
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This document is organized as follows. Section 2 describes the basic 
F-RTO algorithm, and the SACK-enhanced F-RTO algorithm is given in 
Section 3. Section 4 discusses the possible actions to be taken 
after detecting a spurious RTO. Section 5 summarizes the experience 
with F-RTO implementations and the experimental results, and Section 
6 discusses the security considerations. 


1.1. Conventions and Terminology 


The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
"SHOULD", “SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
document are to be interpreted as described in BCP 14, RFC 2119 
[RFC2119] and indicate requirement levels for protocols. 


2. Basic F-RTO Algorithm 


A timeout is considered spurious if it would have been avoided had 
the sender waited longer for an acknowledgment to arrive [LM03]. 
F-RTO affects the TCP sender behavior only after a retransmission 
timeout. Otherwise, the TCP behavior remains the same. When the 
retransmission timer expires, the F-RTO algorithm monitors incoming 
acknowledgments, and if the TCP sender gets an acknowledgment for a 
segment that was not retransmitted due to the timeout, the F-RTO 


algorithm declares a timeout spurious. The actions taken in response 
to a spurious timeout are not specified in this document, but we 
discuss some alternatives in Section 4. This section introduces the 


algorithm and then discusses the different steps of the algorithm in 
more detail. 


Following the practice used with the Eifel Detection algorithm 
[LM03], we use the "SpuriousRecovery" variable to indicate whether 
the retransmission is declared spurious by the sender. This variable 
can be used as an input for a corresponding response algorithm. With 
F-RTO, the value of SpuriousRecovery can be either SPUR_TO 
(indicating a spurious retransmission timeout) or FALSE (indicating 
that the timeout is not declared spurious and the TCP sender should 
follow the conventional RTO recovery algorithm). In addition, we use 
the "recover" variable specified in the NewReno algorithm [FHG04]. 


2.1. The Algorithm 


A TCP sender implementing the basic F-RTO algorithm MUST take the 
following steps after the retransmission timer expires. If the 
retransmission timer expires again during the execution of the F-RTO 
algorithm, the TCP sender MUST re-start the algorithm processing from 
step 1. If the sender implements some loss recovery algorithm other 
than Reno or NewReno [FHG04], the F-RTO algorithm SHOULD NOT be 
entered when earlier fast recovery is underway. 
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The F-RTO algorithm takes different actions based on whether an 
incoming acknowledgment advances the cumulative acknowledgment point 
for a received in-order segment, or whether it is a duplicate 
acknowledgment to indicate an out-of-order segment. Duplicate 
acknowledgment is defined in [APBO9]. The F-RTO algorithm does not 
specify actions for receiving a segment that neither acknowledges new 
data nor is a duplicate acknowledgment. The TCP sender SHOULD ignore 
such segments and wait for a segment that either acknowledges new 
data or is a duplicate acknowledgment. 


1) When the retransmission timer expires, retransmit the first 
unacknowledged segment and set SpuriousRecovery to FALSE. If the 
TCP sender is already in RTO recovery AND "recover" is larger than 
or equal to SND.UNA (the oldest unacknowledged sequence number 
[Pos81]), do not enter step 2 of this algorithm. Instead, store 
the highest sequence number transmitted so far in variable 
"recover" and continue with slow-start retransmissions following 
the conventional RTO recovery algorithm. 


2) When the first acknowledgment after the RTO retransmission arrives 
at the TCP sender, store the highest sequence number transmitted 
so far in variable "recover". The TCP sender chooses one of the 
following actions, depending on whether the ACK advances the 
window or whether it is a duplicate ACK. 


a) If the acknowledgment is a duplicate ACK, OR the Acknowledgment 
field covers "recover" but not more than "recover", OR the 
acknowledgment does not acknowledge all of the data that was 
retransmitted in step 1, revert to the conventional RTO 
recovery and continue by retransmitting unacknowledged data in 
slow start. Do not enter step 3 of this algorithm. The 
SpuriousRecovery variable remains as FALSE. 


b) Else, if the acknowledgment advances the window AND the 
Acknowledgment field does not cover "recover", transmit up to 
two new (previously unsent) segments and enter step 3 of this 
algorithm. If the TCP sender does not have enough unsent data, 
it can send only one segment. In addition, the TCP sender MAY 
override the Nagle algorithm [Nag84] and immediately send a 
segment if needed. Note that sending two segments in this step 
is allowed by TCP congestion control requirements [APBO9]: an 
F-RTO TCP sender simply chooses different segments to transmit. 


If the TCP sender does not have any new data to send, or the 
advertised window prohibits new transmissions, the recommended 
action is to skip step 3 of this algorithm and continue with 
slow-start retransmissions, following the conventional RTO 
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recovery algorithm. However, alternative ways of handling the 
window-limited cases that could result in better performance 
are discussed in Appendix A. 


3) When the second acknowledgment after the RTO retransmission 
arrives at the TCP sender, the TCP sender either declares the 
timeout spurious, or starts retransmitting the unacknowledged 
segments. 


a) If the acknowledgment is a duplicate ACK, set the congestion 
window to no more than 3 * MSS (where MSS indicates Maximum 
Segment Size), and continue with the slow-start algorithm 
retransmitting unacknowledged segments. The congestion window 
can be set to 3 * MSS, because two round-trip times have 
elapsed since the RTO, and a conventional TCP sender would have 
increased cwnd to 3 during the same time. Leave 
SpuriousRecovery set to FALSE. 


b) If the acknowledgment advances the window (i.e., if it 
acknowledges data that was not retransmitted after the 
timeout), declare the timeout spurious, set SpuriousRecovery to 
SPUR_TO, and set the value of the "recover" variable to SND.UNA 
(the oldest unacknowledged sequence number [Pos81]). 


2.2. Discussion 


The F-RTO sender takes cautious actions when it receives duplicate 
acknowledgments after a retransmission timeout. Because duplicate 
ACKs may indicate that segments have been lost, reliably detecting a 
spurious timeout is difficult due to the lack of additional 
information. Therefore, it is prudent to follow the conventional TCP 
recovery in those cases. 


The condition in step 1 prevents the execution of the F-RTO algorithm 
in case a previous RTO recovery is underway when the retransmission 
timer expires, except in case the retransmission timer expires 
multiple times for the same segment. If the retransmission timer 
expires during an earlier RTO-based loss recovery, acknowledgments 
for retransmitted segments may falsely lead the TCP sender to declare 
the timeout spurious. 


If the first acknowledgment after the RTO retransmission covers the 
"recover" point at algorithm step (2a), there is not enough evidence 
that a non-retransmitted segment has arrived at the receiver after 
the timeout. This is a common case when a fast retransmission is 
lost and has been retransmitted again after an RTO, while the rest of 
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the unacknowledged segments were successfully delivered to the TCP 
receiver before the retransmission timeout. Therefore, the timeout 
cannot be declared spurious in this case. 


If the first acknowledgment after the RTO retransmission does not 
acknowledge all of the data that was retransmitted in step 1, the TCP 
sender reverts to the conventional RTO recovery. Otherwise, a 
malicious receiver acknowledging partial segments could cause the 
sender to declare the timeout spurious in a case where data was lost. 


The TCP sender is allowed to send two new segments in algorithm 
branch (2b) because the conventional TCP sender would transmit two 
segments when the first new ACK arrives after the RTO retransmission. 
If sending new data is not possible in algorithm branch (2b), or if 
the receiver window limits the transmission, the TCP sender has to 
send something in order to prevent the TCP transfer from stalling. 

If no segments were sent, the pipe between sender and receiver might 
run out of segments, and no further acknowledgments would arrive. 
Therefore, in the window-limited case, the recommendation is to 
revert to the conventional RTO recovery with slow-start 
retransmissions. Appendix A discusses some alternative solutions for 
window-limited situations. 


If the retransmission timeout is declared spurious, the TCP sender 
sets the value of the "recover" variable to SND.UNA in order to allow 
fast retransmit [FHGO4]. The "recover" variable was proposed for 
avoiding unnecessary, multiple fast retransmits when the 
retransmission timer expires during fast recovery with NewReno TCP. 
Because the F-RTO sender retransmits only the segment that triggered 
the timeout, the problem of unnecessary multiple fast retransmits 
[FHGO4] cannot occur. Therefore, if three duplicate ACKs arrive at 
the sender after the timeout, they probably indicate a packet loss, 
and thus fast retransmit should be used to allow efficient recovery. 
If there are not enough duplicate ACKs arriving at the sender after a 
packet loss, the retransmission timer expires again and the sender 
enters step 1 of this algorithm. 


When the timeout is declared spurious, the TCP sender cannot detect 
whether the unnecessary RTO retransmission was lost. In principle, 
the loss of the RTO retransmission should be taken as a congestion 
signal. Thus, there is a small possibility that the F-RTO sender 
will violate the congestion control rules, if it chooses to fully 
revert congestion control parameters after detecting a spurious 
timeout. The Eifel Detection algorithm has a similar property, while 
the DSACK option can be used to detect whether the retransmitted 
segment was successfully delivered to the receiver. 
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The F-RTO algorithm has a side effect on the TCP round-trip time 
measurement. Because the TCP sender can avoid most of the 
unnecessary retransmissions after detecting a spurious timeout, the 
sender is able to take round-trip time samples on the delayed 
segments. If the regular RTO recovery was used without TCP 
timestamps, this would not be possible due to the retransmission 
ambiguity. As a result, the RTO is likely to have more accurate and 
larger values with F-RTO than with the regular TCP after a spurious 
timeout that was triggered due to delayed segments. We believe this 
is an advantage in networks that are prone to delay spikes. 


There are some situations where the F-RTO algorithm may not avoid 
unnecessary retransmissions after a spurious timeout. If packet 
reordering or packet duplication occurs on the segment that triggered 
the spurious timeout, the F-RTO algorithm may not detect the spurious 
timeout due to incoming duplicate ACKs. Additionally, if a spurious 
timeout occurs during fast recovery, the F-RTO algorithm often cannot 
detect the spurious timeout because the segments that were 
transmitted before the fast recovery trigger duplicate ACKs. 

However, we consider these cases rare, and note that in cases where 
F-RTO fails to detect the spurious timeout, it retransmits the 
unacknowledged segments in slow start, and thus performs the same as 
the regular RTO recovery. 


3. SACK-Enhanced Version of the F-RTO Algorithm 


This section describes an alternative version of the F-RTO algorithm 
that uses the TCP Selective Acknowledgment Option [MMFR96]. By using 
the SACK option, the TCP sender detects spurious timeouts in most of 
the cases when packet reordering or packet duplication is present. 

If the SACK information acknowledges new data that was not 
transmitted after the RTO retransmission, the sender may declare the 
timeout spurious, even when duplicate ACKs follow the RTO. 


3.1. The Algorithm 


Given that the TCP Selective Acknowledgment Option [MMFR96] is 
enabled for a TCP connection, a TCP sender MAY apply the SACK- 
enhanced F-RTO algorithm. If the sender applies the SACK-enhanced 
F-RTO algorithm, it MUST follow the steps below. This algorithm 
SHOULD NOT be applied if the TCP sender is already in loss recovery 
when a retransmission timeout occurs. 


The steps of the SACK-enhanced version of the F-RTO algorithm are as 
follows. If the retransmission timer expires again during the 
execution of the SACK-enhanced F-RTO algorithm, the TCP sender MUST 
re-start the algorithm processing from step 1. 
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1) 


When the retransmission timer expires, retransmit the first 
unacknowledged segment and set SpuriousRecovery to FALSE. 
Following the recommendation in the SACK specification [MMFR96], 
reset the SACK scoreboard. If "RecoveryPoint" is larger than or 
equal to SND.UNA, do not enter step 2 of this algorithm. Instead, 
set variable "RecoveryPoint" to indicate the highest sequence 
number transmitted so far and continue with slow-start 
retransmissions following the conventional RTO recovery algorithm. 


Wait until the acknowledgment of the data retransmitted due to the 
timeout arrives at the sender. If duplicate ACKs arrive before 
the cumulative acknowledgment for retransmitted data, adjust the 
scoreboard according to the incoming SACK information. Stay in 
step 2 and wait for the next new acknowledgment. If the 
retransmission timeout expires again, go to step 1 of the 
algorithm. When a new acknowledgment arrives, set variable 
"RecoveryPoint" to indicate the highest sequence number 
transmitted so far. 


a) If the Cumulative Acknowledgment field covers "RecoveryPoint" 
but not more than "RecoveryPoint", revert to the conventional 
RTO recovery and set the congestion window to no more than 2 * 
MSS, like a regular TCP would do. Do not enter step 3 of this 
algorithm. 


b) Else, if the Cumulative Acknowledgment field does not cover 
"RecoveryPoint" but is larger than SND.UNA, transmit up to two 
new (previously unsent) segments and proceed to step 3. If the 
TCP sender is not able to transmit any previously unsent data 
-- either due to receiver window limitation or because it does 
not have any new data to send -- the recommended action is to 
refrain from entering step 3 of this algorithm. Rather, 
continue with slow-start retransmissions following the 
conventional RTO recovery algorithm. 


It is also possible to apply some of the alternatives for 
handling window-limited cases discussed in Appendix A. 


The next acknowledgment arrives at the sender. Either a duplicate 
ACK or a new cumulative ACK (advancing the window) applies in this 
step. Other types of ACKs are ignored without any action. 


a) If the Cumulative Acknowledgment field or the SACK information 
covers more than "RecoveryPoint", set the congestion window to 
no more than 3 * MSS and proceed with the conventional RTO 
recovery, retransmitting unacknowledged segments. Take this 
branch also when the acknowledgment is a duplicate ACK and it 
does not acknowledge any new, previously unacknowledged data 
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below "RecoveryPoint" in the SACK information. Leave 
SpuriousRecovery set to FALSE. 


b) If the Cumulative Acknowledgment field or a SACK information in 
the ACK does not cover more than "RecoveryPoint" AND it 
acknowledges data that was not acknowledged earlier (either 
with cumulative acknowledgment or using SACK information), 
declare the timeout spurious and set SpuriousRecovery to 
SPUR_TO. The retransmission timeout can be declared spurious, 
because the segment acknowledged with this ACK was transmitted 
before the timeout. 


If there are unacknowledged holes between the received SACK 
information, those segments are retransmitted similarly to the 
conventional SACK recovery algorithm [BAFWO03]. If the algorithm 
exits with SpuriousRecovery set to SPUR TO, "RecoveryPoint" is set to 
SND.UNA, thus allowing fast recovery on incoming duplicate 
acknowledgments. 


3.2. Discussion 


The SACK-enhanced algorithm works on the same principle as the basic 
algorithm, but by utilizing the additional information from the SACK 
option. When a genuine retransmission timeout occurs during a steady 
state of a connection, it can be assumed that there are no segments 
left in the pipe. Otherwise, the acknowledgments triggered by these 
segments would have triggered the SACK loss recovery or transmission 
of new segments. Therefore, if the F-RTO sender receives 
acknowledgments for segments transmitted before the retransmission 
timeout in response to the two new segments sent at the algorithm 
step 2, the normal operation of TCP has been just delayed, and the 
retransmission timeout is considered spurious. Note that this 
reasoning works only when the TCP sender is not in loss recovery at 
the time the retransmission timeout occurs. The condition in step 1 
checking that "RecoveryPoint" is larger than or equal to SND.UNA 
prevents the execution of the F-RTO algorithm in case a previous loss 
recovery, either RTO recovery or SACK loss recovery, is underway when 
the retransmission timer expires. It, however, allows the execution 
of the F-RTO algorithm, if the retransmission timer expires multiple 
times for the same segment. 


4. Taking Actions after Detecting Spurious RTO 


Upon a retransmission timeout, a conventional TCP sender assumes that 
outstanding segments are lost and starts retransmitting the 
unacknowledged segments. When the retransmission timeout is detected 
to be spurious, the TCP sender should not continue retransmitting 
based on the timeout. For example, if the sender was in congestion 
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avoidance phase transmitting new, previously unsent segments, it 
should continue transmitting previously unsent segments in congestion 
avoidance. 


There are currently two alternatives specified for a spurious timeout 
response algorithm, the Eifel Response Algorithm [LGO5], and an 
algorithm for adapting the retransmission timeout after a spurious 
RTO [BBA06]. If no specific response algorithm is implemented, the 
TCP SHOULD respond to spurious timeout conservatively, applying the 
TCP congestion control specification [APB09]. Different response 
algorithms for spurious retransmission timeouts have been analyzed in 
some research papers [GL03, Sar03] and IETF documents [SL03]. 


5. Evaluation of RFC 4138 


F-RTO was first specified in an Experimental RFC (RFC 4138) that has 
been implemented in a number of operating systems since it was 
published. Gained experience has been documented in a separate 
document [KYHSO7], and can be summarized as follows. 


If the TCP sender employs F-RTO, it is able to detect spurious RTOs 
and avoid the unnecessary retransmission of the whole window of data. 
Because F-RTO avoids the unnecessary retransmissions after a spurious 
RTO, it is able to adhere to the packet conservation principle, 
unlike a regular TCP that enters the slow-start recovery 
unnecessarily and inappropriately restarts the ACK clock while there 
are segments outstanding in the network. When a spurious RTO has 
been detected, a sender can select an appropriate congestion control 
response instead of setting the congestion window to one segment. 
Because F-RTO avoids unnecessary retransmissions, it is able to take 
the round-trip time of the delayed segments into account when 
calculating the RTO estimate, which may help in avoiding further 
spurious retransmission timeouts. 


Experimental results with the basic F-RTO have been reported in an 


emulated network using a Linux implementation [SKRO3]. Also, 
different congestion control responses along with the SACK-enhanced 
version of F-RTO were tested in a similar environment [Sar03]. There 


are publications analyzing F-RTO performance over commercial Wideband 
Code Division Multiple Access (W-CDMA) networks, and in an emulated 
High-Speed Downlink Packet Access (HSDPA) network [Yam05, Hok05]. 
Also, Microsoft reported positive experiences with their 
implementation of F-RTO at the IETF-68 meeting. 


It is known that some spurious RTOs may remain undetected by F-RTO if 
duplicate acknowledgments arrive at the sender immediately after the 
spurious RTO, for example due to packet reordering or packet loss. 
There are rare corner cases where F-RTO could "hide" a packet loss 
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and therefore lead to inappropriate behavior with non-conservative 
congestion control response: first, if a massive packet reordering 
occurred so that the acknowledgment of RTO retransmission arrived at 
the sender before the acknowledgments of original transmissions, the 
sender might not detect the loss of the segment that triggered the 
RTO. Second, a malicious receiver could lead F-RTO to make a wrong 
conclusion after an RTO by acknowledging segments it has not 
received. Such a receiver would, however, risk breaking the 
consistency of the TCP state between the sender and receiver, causing 
the connection to become unusable, which cannot be of any benefit to 
the receiver. Therefore, we believe it is not likely that receivers 
would start employing such tricks on a significant scale. Finally, 
loss of the unnecessary RTO retransmission cannot be detected without 
using some explicit acknowledgment scheme such as DSACK. This is 
common to the other mechanisms for detecting spurious RTO, as well as 
to regular TCP that does not use DSACK. We note that if the 
congestion control response to spurious RTO is conservative enough, 
the above corner cases do not cause problems due to increased 
congestion. 


6. Security Considerations 


The main security threat regarding F-RTO is the possibility that a 
receiver could mislead the sender into setting too large a congestion 
window after an RTO. There are two possible ways a malicious 
receiver could trigger a wrong output from the F-RTO algorithm. 
First, the receiver can acknowledge data that it has not received. 
Second, it can delay acknowledgment of a segment it has received 
earlier, and acknowledge the segment after the TCP sender has been 
deluded to enter algorithm step 3. 


If the receiver acknowledges a segment it has not really received, 
the sender can be led to declare spurious timeout in the F-RTO 
algorithm, step 3. However, because the sender will have an 
incorrect state, it cannot retransmit the segment that has never 
reached the receiver. Therefore, this attack is unlikely to be 
useful for the receiver to maliciously gain a larger congestion 
window. 


A common case for a retransmission timeout is that a fast 
retransmission of a segment is lost. If all other segments have been 
received, the RTO retransmission causes the whole window to be 
acknowledged at once. This case is recognized in F-RTO algorithm 
branch (2a). However, if the receiver only acknowledges one segment 
after receiving the RTO retransmission, and then the rest of the 
segments, it could cause the timeout to be declared spurious when it 
is not. Therefore, it is suggested that, when an RTO occurs during 
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the fast recovery phase, the sender would not fully revert the 
congestion window even if the timeout was declared spurious. 
Instead, the sender would reduce the congestion window to 1. 


If there is more than one segment missing at the time of a 
retransmission timeout, the receiver does not benefit from misleading 
the sender to declare a spurious timeout because the sender would 
have to go through another recovery period to retransmit the missing 
segments, usually after an RTO has elapsed. 
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Appendix A. Discussion of Window-Limited Cases 


When the advertised window limits the transmission of two new 
previously unsent segments, or there are no new data to send, it is 
recommended in F-RTO algorithm step (2b) that the TCP sender continue 
with the conventional RTO recovery algorithm. The disadvantage is 
that the sender may continue unnecessary retransmissions due to 
possible spurious timeout. This section briefly discusses the 
options that can potentially improve performance when transmitting 
previously unsent data is not possible. 


-— The TCP sender could reserve an unused space of a size of one or 
two segments in the advertised window to ensure the use of 
algorithms such as F-RTO or Limited Transmit [ABF01] in receiver 
window-limited situations. On the other hand, while doing this, 
the TCP sender should ensure that the window of outstanding 
segments is large enough for proper utilization of the available 
pipe. 


- Use additional information if available, e.g., TCP timestamps with 
the Eifel Detection algorithm, for detecting a spurious timeout. 
However, Eifel detection may yield different results from F-RTO 
when ACK losses and an RTO occur within the same round-trip time 
[SKRO3]. 


— Retransmit data from the tail of the retransmission queue and 
continue with step 3 of the F-RTO algorithm. It is possible that 
the retransmission will be made unnecessarily. Furthermore, the 
operation of the SACK-based F-RTO algorithm would need to consider 
this case separately, to not use the retransmitted segment to 
indicate spurious timeout. Given these considerations, this option 
is not recommended. 


-— Send a zero-sized segment below SND.UNA, similar to a TCP Keep- 
Alive probe, and continue with step 3 of the F-RTO algorithm. 
Because the receiver replies with a duplicate ACK, the sender is 
able to detect whether the timeout was spurious from the incoming 
acknowledgment. This method does not send data unnecessarily, but 
it delays the recovery by one round-trip time in cases where the 
timeout was not spurious. Therefore, this method is not 
encouraged. 


- In receiver-limited cases, send one octet of new data, regardless 
of the advertised window limit, and continue with step 3 of the 
F-RTO algorithm. It is possible that the receiver will have free 
buffer space to receive the data by the time the segment has 


Sarolahti, et al. Standards Track [Page 15] 


RFC 5682 F-RTO September 2009 


propagated through the network, in which case no harm is done. If 
the receiver is not capable of receiving the segment, it rejects 
the segment and sends a duplicate ACK. 


Appendix B. Changes since RFC 4138 


Changes from RFC 4138 are summarized below, apart from minor 
editing and language improvements. 


* Modified the basic F-RTO algorithm and the SACK-enhanced F-RTO 
algorithm to prevent the TCP sender from applying the F-RTO 
algorithm if the retransmission timer expires when an earlier RTO 
recovery is underway, except when the retransmission timer expires 
multiple times for the same segment. 


* Clarified behavior on multiple timeouts. 


* Added a paragraph on acknowledgments that do not acknowledge new 
data but are not duplicate acknowledgments. 


* Clarified the SACK-algorithm a bit, and added one paragraph of 
description of the basic idea of the algorithm. 


* Removed SCTP considerations. 


* Removed earlier Appendix sections, except Appendix C from RFC 4138, 
which is now Appendix A. 


* Clarified text about the possible response algorithms. 

* Added section that summarizes the evaluation of RFC 4138. 
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